Memory efficient sanitization of a deduplicated storage system
نویسندگان
چکیده
Sanitization is the process of securely erasing sensitive data from a storage system, effectively restoring the system to a state as if the sensitive data had never been stored. Depending on the threat model, sanitization could require erasing all unreferenced blocks. This is particularly challenging in deduplicated storage systems because each piece of data on the physical media could be referred to by multiple namespace objects. For large storage systems, where available memory is a small fraction of storage capacity, standard techniques for tracking data references will not fit in memory, and we discuss multiple sanitization techniques that trade-off I/O and memory requirements. We have three key contributions. First, we provide an understanding of the threat model and what is required to sanitize a deduplicated storage system as compared to a device. Second, we have designed a memory efficient algorithm using perfect hashing that only requires from 2.54 to 2.87 bits per reference (98% savings) while minimizing the amount of I/O. Third, we present a complete sanitization design for EMC Data Domain.
منابع مشابه
Similarity and Location Aware Scalable Deduplication System for Virtual Machine Storage Systems
I.INTRODUCTION In this paper with the potentially unlimited storage space offered by cloud providers, users tend to use a large amount space as they can and vendors continually look for techniques aimed to reduce redundant data and exploit space savings. A technique which has been widely adopted is crossuser deduplication. The simple idea behind deduplication is to accumulate duplicate data onl...
متن کاملEffects of Memory Randomization, Sanitization and Page Cache on Memory Deduplication
Memory deduplication merges same-content memory pages and reduces the consumption of physical memory. It is a desirable feature for virtual machines on IaaS (Infrastructure as a Service) type cloud computing, because IaaS hosts many guest OSes which are expected to include many identical memory pages. However, some security capabilities of the guest OS modify memory contents for each execution ...
متن کاملReliably Erasing Data from Flash-Based Solid State Drives
Reliably erasing data from storage media (sanitizing the media) is a critical component of secure data management. While sanitizing entire disks and individual files is well-understood for hard drives, flash-based solid state disks have a very different internal architecture, so it is unclear whether hard drive techniques will work for SSDs as well. We empirically evaluate the effectiveness of ...
متن کاملDupLESS: Server-Aided Encryption for Deduplicated Storage
Cloud storage service providers such as Dropbox, Mozy, and others perform deduplication to save space by only storing one copy of each file uploaded. Should clients conventionally encrypt their files, however, savings are lost. Message-locked encryption (the most prominent manifestation of which is convergent encryption) resolves this tension. However it is inherently subject to brute-force att...
متن کاملDelta Compressed and Deduplicated Storage Using Stream-Informed Locality
For backup storage, increasing compression allows users to protect more data without increasing their costs or storage footprint. Though removing duplicate regions (deduplication) and traditional compression have become widespread, further compression is attainable. We demonstrate how to efficiently add delta compression to deduplicated storage to compress similar (nonduplicate) regions. A chal...
متن کامل